102 research outputs found

    Visually Mining the Datacube using a Pixel-Oriented Technique

    No full text
    This paper introduces a new technique that eases the navigation and interactive exploration of huge multidimensional datasets. Following the pixel-oriented paradigm, the key ingredient enabling interactive navigation of extreme volumes of data is a set of functions bijectively mapping data elements to screen pixels. Using this mapping from data elements to pixels constrains the computational complexity of the rendering process to be linear in the number of rendered pixels on the screen rather than in the dataset size. Our method furthermore allows the implementation of usual information visualization techniques such as zoom and pan, anamorphosis and texturing. As a proof of concept, we show how our technique can be adapted to interactively explore the Datacube, turning our approach into an efficient system for visual data mining. We report experiments conducted on a Datacube containing 50 million items. To our knowledge, our technique outperforms all existing ones and pushes the scalability limit close to a billion elements. Because it supports all basic navigation techniques and is flexible, it is easily reusable for a large number of applications.

    PORGY: a Visual Analytics Platform for System Modelling and Analysis Based on Graph Rewriting

    PORGY is a visual environment for rule-based modelling based on port graphs and port graph rewrite rules whose application is steered by rewriting strategies. The focus of this demonstration is the visual and interactive features offered by PORGY, which facilitate an exploratory approach to modelling, simulating and analysing different ways of applying the rules while recording the model evolution, as well as tracking and plotting system parameters.

    Organization of Information for the Web using Hierarchical Fuzzy Clustering Algorithm based on Co-Occurrence Networks

    In this paper, we present a Hierarchical Fuzzy Clustering algorithm which uses domain knowledge to automatically determine the number of clusters and their initial values. The algorithm is applied to a collection of web pages and the results are compared with existing algorithms in the literature.

    Topological Decomposition and Heuristics for High Speed Clustering of Complex Networks

    With the exponential growth in the size of data and networks, the development of new and fast techniques to analyze and explore these networks is becoming a necessity. Moreover, the emergence of scale-free and small-world properties in real-world networks has stimulated a great deal of activity in the field of network analysis and data mining. Clustering remains a fundamental technique to explore and organize these networks. A challenging problem is to find a clustering algorithm that works well in terms of clustering quality and is efficient in terms of time complexity. In this paper, we propose a fast clustering algorithm which combines some heuristics with a Topological Decomposition to obtain a clustering. The algorithm, which we call Topological Decomposition and Heuristics for Clustering (TDHC), is highly efficient in terms of asymptotic time complexity compared to other existing algorithms in the literature. We also introduce a number of heuristics that complement the clustering algorithm and increase the speed of the clustering process while maintaining the high quality of the clustering. We show the effectiveness of the proposed clustering method on different real-world data sets and compare its results with well-known clustering algorithms.

    Identifying the Presence of Communities in Complex Networks Through Topological Decomposition and Component Densities

    The exponential growth of data in various fields such as social networks and the Internet has stimulated a great deal of activity in the field of network analysis and data mining. Identifying communities remains a fundamental technique to explore and organize these networks. A few metrics are widely used to discover the presence of communities in a network. We argue, by presenting counterexamples, that these metrics do not truly reflect the presence of communities. This is because these metrics concentrate on local cohesiveness among nodes, where the goal is to judge whether or not two nodes belong to the same community, thereby losing the overall perspective of the presence of communities in the entire network. In this paper, we propose a new metric to identify the presence of communities in real-world networks. This metric is based on the topological decomposition of networks, taking into account two important ingredients of real-world networks: the degree distribution and the density of nodes. We show the effectiveness of the proposed metric by testing it on various real-world data sets.

    Evaluating the Quality of Clustering Algorithms using Cluster Path Lengths

    Many real-world systems can be modeled as networks or graphs. Clustering algorithms that help us to organize and understand these networks are usually referred to as graph-based clustering algorithms. Many algorithms exist in the literature for clustering network data. Evaluating the quality of these clustering algorithms is an important task addressed by different researchers. An important ingredient in evaluating these clustering techniques is the node-edge density of a cluster. In this paper, we argue that evaluation methods based on density are heavily biased towards networks having dense components, such as social networks, but are not well suited for data sets with other network topologies where the nodes are not densely connected. Examples of such data sets are transportation and Internet networks. We justify our hypothesis by presenting examples from real-world data sets. We present a new metric to evaluate the quality of a clustering algorithm that overcomes the limitations of existing cluster evaluation techniques. This new metric is based on the path length of the elements of a cluster and avoids judging the quality based on cluster density. We show the effectiveness of the proposed metric by comparing its results with other existing evaluation methods on artificially generated and real-world data sets.

    PORGY : réécriture et visualisation de graphes dynamiques

    This article presents first results on the visualization and interactive manipulation of a graph rewriting system. We address the visualization of graphs whose topology evolves over time according to modifications dictated by rewrite rules. The system must not only show the graph as it evolves over time; it also serves as a complete workbench for studying the rewriting system itself. The aim is to use visualization in support of the study of the rewriting system, making it possible to understand its behaviour and to identify its properties, such as the convergence of computations or blocking configurations. Beyond the questions of graph drawing, which are still the subject of ongoing work, the system addresses the problems of pattern identification and rewriting history. We also discuss the more technical questions relating to the internal structure of the system.

    Dessin de graphe assisté par un algorithme génétique

    Interactive graph visualization interfaces are a promising perspective for data analysis and, by extension, for decision support systems. The aim of this project is to assist a novice user in drawing graphs. Currently, one of the main difficulties for the user is choosing the drawing algorithm best suited to their graph: a very large number of methods exist, and not all of them are easy to use. The proposed solution is to automatically generate several viable drawings of the same graph. These drawings are produced by a force-directed placement algorithm (a mass-spring system) modified so that its parameters can be set vertex by vertex. The parameter sets are supplied by a genetic algorithm. This article mainly presents a proof of concept that the modified algorithm and the associated genetic algorithm can be used to draw any type of graph, and in particular can reproduce highly constrained drawings (right angles or parallel edges) in a way very different from what force-directed placement algorithms usually do. The points covered are the mass-spring model and the modifications made to it, the main characteristics of the genetic algorithm, the similarity metric used to evaluate the drawings generated during learning, and the application case proposed as a proof of concept.